Skip to content

simplify unpack logic #6908

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
May 6, 2021
Merged

simplify unpack logic #6908

merged 4 commits into from
May 6, 2021

Conversation

kaja47
Copy link
Contributor

@kaja47 kaja47 commented Apr 25, 2021

This is the first PR in an effort to speed up unpack function.
So far it mainly simplifies the code. There's also a small speedup by not doing byte-wise unpacking by php_unpack.

  • move endiannes check to compile time
  • remove php_unpack function
  • the compiler take care of sign extension

- move endiannes check to compile time
- remove php_unpack function
- the compiler take care of sign extension
@kaja47 kaja47 marked this pull request as draft April 25, 2021 12:23
@kaja47 kaja47 marked this pull request as ready for review April 25, 2021 13:53
Comment on lines 58 to 59
#define IS_LITTLE_ENDIAN (*(unsigned char *)&(uint16_t){1})

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't you reuse the WORDS_BIGENDIAN macro which is defined in Zend/zend_types.h ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure can.


if (type == 'q') {
v = (zend_long) v;
v = (int64_t) x;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Far from being a C expert but isn't the following:

Suggested change
v = (int64_t) x;
v = *(int64_t*) &x;

the correct way of type-punning? As IIRC signed/unsigned casts are UB if the value exceeds the range of a type.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For casts between integers, the existing code is fine. What you may have in mind is a bitwise cast between float and int.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure, I double checked the standard and apparently unsigned to signed is UB when the value doesn't fit, but almost all compiler do two complement (according to https://unspecified.wordpress.com/category/c-3/ and if I read this correctly).

So should only be an issue on unicorn compilers

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The standard (6.3.1.3 §3) mentions this conversion produce a result which is implementation-defined, not UB.
Is it OK to keep it like this?

map = little_endian_short_map;
v = (int16_t) x;
} else if ((type == 'n' && MACHINE_LITTLE_ENDIAN) || (type == 'v' && !MACHINE_LITTLE_ENDIAN)) {
v = php_pack_reverse_int32(x) >> 16;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add reverse_int16?

case 'v': { /* unsigned little endian */
zend_long v = 0;
uint16_t x;
memcpy(&x, &input[inputpos], 2);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be nicer to load through unaligned_uint16_t instead? (Grep for existing typedefs...)

Copy link
Contributor Author

@kaja47 kaja47 Apr 26, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mean like this: uint16_t x = *((unaligned_uint16_t*) &input[inputpos])

Isn't in C access to char array through pointer of any other type forbidden?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't really care about strict aliasing as long as it's not actively miscompiled ;) I'm okay with keeping the memcpy though, I expect it to compile down to about the same.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok then, I'll change it.


if (type == 'q') {
v = (zend_long) v;
v = (int64_t) x;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For casts between integers, the existing code is fine. What you may have in mind is a bitwise cast between float and int.

@nikic nikic merged commit efe79e0 into php:master May 6, 2021
@nikic
Copy link
Member

nikic commented May 6, 2021

Sorry, I forgot to merge this :(

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants